Resilient secret sharing cloud based architecture for data vault

ABSTRACT

A method of securely storing data, including: providing, within a secure data storage system, a plurality of secret sharing methods for selection; and identifying a striping policy for storage of the data, in accordance with input preferences. The data can be split into N secret shares according to a selected secret sharing method, the selection being determined by the striping policy, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N. Metadata associated with the data is generated, the metadata identifying the selected secret sharing method, and is stored within the secure data storage system. The secret shares are written to storage that includes storage outside the secure data storage system, such that, when at least T shares are retrieved, the metadata can be recalled to identify the selected secret sharing method for recovery of the data.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of International Application No. PCT/GB2016/052009, filed Jul. 1, 2016, which claims the benefit of U.S. Provisional Application No. 62/188,058, filed Jul. 2, 2015, the entire contents of which are fully incorporated herein by reference.

FIELD OF THE INVENTION

This invention relates to the secure storage of data.

BACKGROUND

Computing has witnessed a change from on-premises infrastructure to convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction, also known as Cloud Computing.

Cloud computing provides enterprises with benefits such as saving on capital and operational costs, improving scalability and flexibility and reducing the carbon footprint. However, Cloud computing also presents a number of disadvantages such as data security and reliability issues.

In Cloud computing, on-premises architectures within organisations have simply been scaled-out into the Cloud, with the addition of encryption. This methodology has been shown to be weak in many respects, especially in relation to: trusted administrator access; lack of proper access control; Advanced Persistent Threat (APT); and the loss of private keys. Many systems are often protected with symmetric key encryption methods, where the key is protected by a password or encrypted using public key encryption. Along with this, anyone with System Administrator access can gain access to the encrypted content. The current encryption methods in the Cloud often suffer where the loss of a single encryption key can result in large-scale data loss.

Many organisations use the same methods of robustness and failover as they do within their internal systems. With the Cloud, there is a risk of a major outage in parts of the Cloud resulting in denial of service. More severely, an outage can cause business shut down as there is no alternative means of accessing data. Beyond this, the user's privacy is usually jeopardised as Cloud service providers cache, copy and archive users' data, which can easily be retrieved, used and misused by miscreants, competitors or a court of law even when the owner seems to have deleted it.

U.S. Pat. No. 8,423,466 describes a transaction system that sits between a bank or payment provider and a user and acts as a secure, trusted system for arranging payment once a transaction has been fulfilled and only once the identities of both users have been authenticated and appropriate checks have been completed. The system allows a user to transact with merchants over numerous different channels, using a single authentication means to interact with the system, thereby to be authenticated and arrange a payment, without having to reveal financial details to the merchant. The system provides multi-channel, consistent anti-fraud measures and validation services to users to ensure that the other users involved in the transaction are who they claim and are transacting within allowed limits.

Since many systems have been breached by a compromise involving the loss of a private key, one method to overcome this problem is to use keyless encryption. In one example, keyless encryption involves breaking the data into secret shares which can be distributed amongst those who have the rights to the data. If any data elements are accessed, it will not be possible to recover the original data until the other relevant shares are available.

Secret sharing schemes have been proposed for data splitting and reconstruction, thereby providing data security in a keyless manner. Such algorithms include Adi Shamir's Perfect Secret Sharing Scheme (PSS), Hugo Krawczyk's Secret Sharing made short or Computational Secret Sharing scheme (CSS) and Rabin's Information Dispersal Algorithm (IDA), among others. These algorithms break a secret into chunks (shares) under a (T-out-of-N) threshold, where N is the total number of shares and T is the number required to recover the secret. Fewer than the threshold number (T) of shares cannot recover the secret. The performance overhead of the different secret sharing schemes, at increasing thresholds and increasing data sizes, shows varied behaviours, and has restricted the advancement of secret sharing schemes in use.
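By way of illustration only, the (T-out-of-N) threshold property can be sketched with Shamir's polynomial scheme over a prime field. The field modulus, the function names `split_secret` and `recover_secret`, and the representation of the secret as a single integer are assumptions made for this sketch and are not taken from the specification.

```python
import secrets

# Assumed field modulus for this sketch: the Mersenne prime 2**127 - 1.
PRIME = 2**127 - 1


def split_secret(secret: int, t: int, n: int) -> list[tuple[int, int]]:
    """Split an integer secret into n shares, any t of which recover it."""
    if not 1 <= t <= n:
        raise ValueError("require 1 <= t <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in the field")
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from at least t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

In this sketch, `split_secret(s, 3, 5)` yields five shares and `recover_secret` returns `s` from any three of them; with fewer than three shares the interpolation does not determine the constant term.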

Another consideration taken into account when using a Cloud based storage system is ensuring that it is survivable. That is to say, the Cloud based storage system is able to securely store critical information and ensure that it persists, is continuously accessible, cannot be destroyed and is kept confidential. Survivable Cloud storage systems entrust data to a set of Clouds. Relying on a single Cloud Storage Provider (CSP) is subject to confidentiality and availability risks. As such, the data should be fragmented and then distributed among multiple CSPs.

SUMMARY OF THE INVENTION

According to a first aspect of the invention, a method of securely storing data is provided. The method comprises: providing, within a secure data storage system, a plurality of secret sharing methods for selection; identifying a striping policy for storage of the data, in accordance with input preferences; splitting the data into a plurality, N, of secret shares according to a selected one of the plurality of secret sharing methods, the selection being determined by the striping policy, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; generating metadata associated with the data, the metadata identifying the selected secret sharing method; storing the metadata within the secure data storage system; and writing the secret shares to storage. The storage preferably includes storage outside the secure data storage system. When at least T shares are retrieved, the metadata can be recalled to identify the selected secret sharing method for recovery of the data.

The secret sharing methods preferably include methods with relatively high security but relatively low resilience and methods with relatively high resilience but relatively low security, wherein the selection of the striping policy is based on preferences that are translated into security and resilience preferences.

The secret sharing methods may include methods or algorithms with relatively high T/N and relatively low T/N. An interface is provided to enable a user or administrator to input preferences that are translated into selection of a striping policy.

The secret sharing methods preferably include different secret sharing algorithms selected from the group that includes: perfect secret sharing scheme (PSS); computational secret sharing (CSS); information dispersal algorithm; and Reed-Solomon encoding combined with encryption.

The policy preferably translates a user preference for security, resilience and/or performance into a selection of method/algorithm.

Each share is preferably written to an independent store, at least some of which are outside the secure storage system, such as: a public cloud; a private cloud; a NoSQL data store; and a file server.

In accordance with a second aspect of the invention, a system for securely storing data is provided. The system comprises: a secret sharing module adapted to provide a plurality of secret sharing methods for selection, each method arranged to split the data into a plurality, N, of secret shares wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; a policy module adapted to determine a policy for storage of the data, in accordance with input preferences, wherein the method selected by the secret sharing module for splitting the data is determined by the policy module; a metadata module for generating and storing metadata associated with the data, the metadata identifying the selected secret sharing method; and a memory and storage interface for writing the secret shares to storage such that, when at least T shares are retrieved from storage, the metadata can be recalled to identify the selected secret sharing method for recovery of the data.

Also provided is a computer program product comprising program code which, when executed by a computer, causes the computer to perform the above method.

In accordance with a further aspect of the invention, a method of securely storing data is provided that comprises: fragmenting (otherwise referred to as striping) the data into a plurality, N, of secret shares, typically of equal size, according to a secret sharing algorithm, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; splitting each share into data particles of equal size; writing the particles to storage such that the particles of each share are written to independent storage means corresponding to that share, each particle being identified only by an identifier unique within its respective storage means. In this manner, loss of an independent storage means or loss of a particle within that independent storage means preferably results in loss of at most one share.
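The particle step described above might be sketched as follows. This is a minimal illustration under stated assumptions: each independent storage means is modelled as a Python dictionary, particles are padded with zero bytes to equal size, and identifiers are random hexadecimal strings. The function names and the ordered list of identifiers returned to the caller (which would be held as metadata) are hypothetical.

```python
import secrets


def split_into_particles(share: bytes, size: int) -> list[bytes]:
    """Cut one share into equal-size particles, zero-padding the last one."""
    if size <= 0:
        raise ValueError("particle size must be positive")
    padded_len = -(-len(share) // size) * size  # round up to a multiple of size
    padded = share.ljust(padded_len, b"\x00")
    return [padded[i:i + size] for i in range(0, padded_len, size)]


def write_share(share: bytes, store: dict[str, bytes], size: int) -> list[str]:
    """Write the particles of one share to the store that corresponds to it.

    Each particle is keyed only by an identifier unique within that store.
    The returned ordered list of identifiers is kept by the caller.
    """
    ids = []
    for particle in split_into_particles(share, size):
        pid = secrets.token_hex(16)
        while pid in store:  # regenerate on the (unlikely) collision
            pid = secrets.token_hex(16)
        store[pid] = particle
        ids.append(pid)
    return ids


def read_share(ids: list[str], store: dict[str, bytes], length: int) -> bytes:
    """Reassemble a share from its particles and strip the padding."""
    return b"".join(store[pid] for pid in ids)[:length]
```

Because the identifiers carry no ordering or ownership information, the particles of a share can co-exist in the store with dummy or expired particles without being distinguishable from them by name.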

The method may comprise pre-storing particles of dummy data within each storage means and/or may comprise performing a clean-up process for each storage means, whereby particles that exist in the storage means are identified as having expired. Particles of data and particles of dummy or expired data preferably co-exist in the storage means.

The method may further comprise identifying a persistence policy for storage of the data in accordance with input preferences, whereby a set of storage means is selected for storage of the data in accordance with the persistence policy and/or in accordance with a sensitivity attribute associated with the data. Some policies may include restrictions on attributes of the storage means that are to be selected to make up the set of storage means. Some policies may be defined for user selection that include different attributes for each of the storage means that are to be selected to make up the set of storage means. Such attributes may include identifiers of storage providers and geographical locations of the storage means. Policies may include user latency preference and/or may include duplication of one or more shares across plural independent storage means and/or may include trustworthiness of the storage means. “Trustworthiness” is not merely an abstract concept in the mind of the user; it may be defined in technical features such as by electronically signed certification, and/or may include challenge and response with a certification server.

The method preferably includes monitoring the performance of each storage means for improvement of selection of storage means according to the persistence policy (e.g. adjusting the selection of storage means based on performance in response to the monitoring).

In accordance with a further aspect of the invention, a secure storage system is provided comprising: an input interface for receiving data for storage; a secret sharing module for fragmenting the data into a plurality, N, of secret shares of equal size according to a secret sharing algorithm, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; and a persistence module for splitting each share into data particles of equal size and for writing the particles to storage such that the particles of each share are written to independent storage means corresponding to that share, each particle being identified only by an identifier unique within its respective storage means.

Also provided is a computer program product comprising program code which, when executed by a computer, causes the computer to: receive data for storage; fragment the data into a plurality, N, of secret shares of equal size according to a secret sharing algorithm, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; split each share into data particles of equal size; and write the particles to storage such that the particles of each share are written to independent storage means corresponding to that share, each particle being identified only by an identifier unique within its respective storage means.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a high level diagram illustrating, generically, elements of a storage system in accordance with embodiments of the invention.

FIG. 2 is a more detailed diagram of a storage system, referred to as a survivable cloud storage system (SCSS) architecture.

FIG. 3 illustrates tiered software components of a further embodiment.

FIG. 4 illustrates operation of the system of FIG. 3.

FIG. 5 further illustrates, in hardware and software elements, certain aspects of the system of FIG. 3.

FIG. 6 is a process flow diagram illustrating operation of the system of FIG. 5.

FIG. 7 further illustrates, in hardware and software elements, certain aspects of the system of FIG. 3.

FIG. 8 illustrates certain elements of an embodiment described in an appendix.

DETAILED DESCRIPTION

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

FIG. 1 shows an architecture which supports a secret sharing scheme in a multi-cloud environment 100. It can be viewed as having an application platform 102 (having a secret sharing module that will be described), a main multi-cloud proxy server (with router) 104 and a metadata server 106. The metadata server 106 is illustrated as being connected between the application platform 102 and the main multi-cloud proxy server 104, illustrating that metadata can be associated with data passing between the application platform 102 and the main multi-cloud proxy server 104 in each direction.

The function of the application platform 102 is to: determine access structure; encode secrets; send secrets to the main multi-cloud proxy server 104 for distribution to multi-cloud service providers; and reconstruct the secret shares when recovered. The main multi-cloud proxy server (with router) 104 splits and distributes encoded shares to the multi-cloud based on a pre-determined access structure and manages the fail-over protection of shares. The metadata server 106 includes the functionality of: user management; server management; session management; policy management; and file metadata management.

The architecture may also have a multi-cloud proxy server for gathering shares and reconstructing secrets as well as managing break-glass data recovery. There may be sub-routers to create a path between a cloud service provider (CSP) (considered here as front-end) and other cloud service providers (considered here as the back-ends), thereby creating a quick and alternative recovery path for all the shares.

At the application platform 102, the data owner determines N and T values and, using both, calls up the application to be used and selects an algorithm of choice based on an evaluation after a successful sign-in to the system (e.g. as described in U.S. Pat. No. 8,423,466), and an access level is determined. The values for N and T are not directly selected by the user, but such values are prescribed for attributes selected by the user (e.g. “very secure,” “very resilient” etc.). Translation of selected attributes into a selected algorithm (with or without selected encryption) and parameters for that algorithm is automated.
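The automated translation of user-selected attributes into an algorithm and its parameters could be sketched as a simple lookup. The table below is purely hypothetical: the pairing of each attribute with a particular algorithm and (T, N) values, the "balanced" attribute, and the names `StripingPolicy`, `POLICY_TABLE` and `select_policy` are assumptions made for illustration and are not prescribed by the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripingPolicy:
    algorithm: str   # e.g. "PSS", "CSS" or "IDA"
    threshold: int   # T, the number of shares needed for recovery
    shares: int      # N, the total number of shares

    def tolerated_losses(self) -> int:
        """Number of shares that can be lost while recovery stays possible."""
        return self.shares - self.threshold


# Hypothetical mapping from user-facing attributes to policies.
POLICY_TABLE = {
    "very secure": StripingPolicy("PSS", 3, 5),
    "very resilient": StripingPolicy("IDA", 4, 10),
    "balanced": StripingPolicy("CSS", 2, 5),
}


def select_policy(attribute: str) -> StripingPolicy:
    """Translate a user-selected attribute into an algorithm and (T, N)."""
    try:
        return POLICY_TABLE[attribute.strip().lower()]
    except KeyError:
        raise ValueError(f"no policy defined for attribute {attribute!r}") from None
```

In such a design the user never enters T or N directly; changing the prescribed values is a configuration change to the table rather than a change to the calling code.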

In addition to selection of algorithm and parameters based on user selected attributes (security, resilience, overhead cost), the choice of algorithm and encryption can further be based on data size and performance (and indeed performance for a given data size).

The selected algorithm may have a 3-out-of-5 access structure or a 4-out-of-10 or a 2-out-of-5. The encoded data is sent to the local main multi-cloud proxy server with router 104 for onward dissemination to the CSPs. The proxy splits the encoded data according to a secret-sharing scheme determined access structure, and distributes each share over the Internet to different CSPs (or distributes some to CSPs and others to local/in-house storage).

The retrieval process is similar to the storage process, as the metadata server (106) helps to keep track of the siblings of the shares. The proxy retrieves enough corresponding shares from the cloud service providers. This retrieval involves authentication to the cloud providers. The retrieved shares are sent back to the application platform (102), which decodes them and verifies their authenticity before reconstructing the data. The system is capable of a break-glass data recovery through the local multi-cloud proxy server in case of emergency, after which a clean-up should be performed at the end of the activities for record purposes.

The design incorporates unique features in a multi-cloud environment as it uses secret sharing schemes to implement keyless encryption. This is done by breaking the secret into chunks in such a manner that fewer than T shares cannot recover the secret, thus using it for data distribution in an object storage system. This is also used to implement safe destruct with equally divided shares. The incorporation of a self-destructive system solves the problem of cloud users' privacy, as there is no way a user's data can be accessed, copied, cached or used without the data owner's consent within a pre-determined time-frame, because all data and their copies are destroyed or become unreadable after a user-specified time, without any user intervention.

The self-destructive system defines two modules: a self-destruct method object; and a survival time parameter for each secret key part. In this case, a secret sharing algorithm is used to implement share distribution in an object storage system so as to ensure safe destruct with equally divided shares. Based on an active storage framework, an object-based storage interface will be used to store and manage the equally divided shares.

The use and implementation of threshold systems in cloud services are deliberate acts towards implementing a failover protection in the model. In normal circumstances, all the service providers are used in share storage as well as secret reconstruction, but in an extreme situation, 2-out-of-5 can be made redundant. That is to say, if 2-out-of-5 CSPs fail, data/secret storage and reconstruction are still possible.

The use of a second local multi-cloud proxy server and sub-routers is for the implementation of a break-glass data recovery. With the sub-routers and the second multi-cloud server, a route is established to and from all the CSPs. Having decided on a 3-out-of-5 access structure, only 3-out-of-5 CSPs are required to store and reconstruct the secret in an emergency situation. By this feature, the concept of total business shut down or denial of service may not exist in using this model, though the number of CSPs required is dependent on the secret sharing algorithm of choice in times of secret reconstruction.

A break-glass data recovery system can be implemented using one of the proxy servers. An access to the multi-Cloud proxy server entails access to particular CSPs that provide access to all other CSPs (e.g. CSPs 1, 3 and 5 provide a link to CSPs 2 and 4). In this example, these are different independent CSPs of the same or different storage architectures. The relationships are linked for redundancy but are mutually exclusive in terms of storage architectures.

Access to particular CSPs ensures a quick recovery of shares in order to reconstruct the secret as it is a quick link to all other CSPs. Moreover, following the access structure, such access ensures the possibility of reconstructing the secret in an emergency situation. This is a useful feature, as there could be a period of cloud outage, and in such a situation, data recovery could be done from 3-out-of-5 Cloud service providers being used for data storage. That is to say, if 2 out of the 5 cloud service providers fail, data recovery is still possible in such an extreme condition.
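The failover arithmetic can be made concrete with a short sketch. With one share per CSP, a T-out-of-N structure tolerates N − T unavailable CSPs. The second function additionally assumes, for illustration only, that each CSP is available independently with the same probability p; that independence assumption and both function names are not taken from the specification.

```python
from math import comb


def max_tolerated_failures(n: int, t: int) -> int:
    """With one share per CSP, recovery survives the loss of up to n - t CSPs."""
    return n - t


def recovery_probability(n: int, t: int, p: float) -> float:
    """Probability that at least t of n CSPs are reachable.

    Assumes (for this sketch) that CSPs are available independently,
    each with probability p, so the count of reachable CSPs is binomial.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t, n + 1))
```

For the 3-out-of-5 structure discussed above, `max_tolerated_failures(5, 3)` evaluates to 2, matching the statement that recovery remains possible when 2 of the 5 providers fail.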

The proposed architecture can provide the following:

1. fast and efficient data/key distribution to multi-cloud service providers;

2. keyless encryption and therefore increased data security;

3. data owner's privacy by implementing a self-destructing data system (SeDaS), as it meets all the privacy-preserving goals;

4. support, through SeDaS, for securely erasing files and random storage in drives (Cloud, HDD or SSD) respectively;

5. backup operational mode in which the function of 5 CSPs can be assumed by 3 CSPs when 2-out-of-the-5 CSPs become unavailable either through failure or scheduled down time; and

6. break-glass data recovery.

FIG. 2 shows a Survivable Cloud Storage system (SCSS) architecture 200. It is shown as having three parts: tier 3 middleware 202, tier 2 middleware 204 and cloud data stores 206.

The tier 3 middleware 202 comprises a secret sharing module 214, a metadata module 218 and a scheduler module 222, connected across interfaces 216 and 220 and connected to a source of data 208 via interface 212 and to cloud I/O services 226 via interface 224. It has a policy engine 242 coupled to each of elements 214, 218 and 222 via software interfaces (APIs) 236, 238 and 240 and has a degradation detection and recovery (DDR) module 234 coupled to metadata module 218 via software interface 241.

The tier 2 middleware comprises a cloud proxies module 244 and a sticky policy enforcement module 254 connected across interface 256. The cloud proxies module 244 is connected to the cloud I/O services 226 via interface 243.

Sticky policies are described in general terms in Sticky Policies for Data Control in the Cloud by Slim Trabelsi and Jakub Sendor, 2012 Tenth Annual International Conference on Privacy, Security and Trust, where it is explained that sticky policies are security and privacy constraints that are permanently “attached” to data. It is described that when sensitive information is sent to the cloud, it is stored with the sticky policy attached to it (for storage of the sticky policy in the cloud). It is also described that an entity that wants to decrypt data needs to comply with the sticky policy in order to receive a decryption token from a certification authority.

In the preferred embodiment of the present invention, all that is sent to the cloud attached to the data is sufficient information (e.g. an ID) to permit the system to identify the sticky policy that has been applied and from which the data can be reconstructed when sufficient shares have been retrieved. That is, the details of the sticky policy remain as metadata in the metadata module 218 while forever remaining associated with the data. This fragmentation provides greater security of the overall secure storage system and obfuscates the details of the sticky policy, encryption method and other attributes related to any data fragment stored in the cloud by the present invention.

Cloud data stores 206 comprise public clouds 258 connected to the cloud proxies 244 via interface 246; private clouds 260 connected to the cloud proxies 244 via interface 248; NoSQL data stores 262 connected to the cloud proxies 244 via interface 250 and/or traditional file servers 264 connected to the cloud proxies 244 via interface 252. Different ones of these different types of store may be available and used in different circumstances, as will be described. It is particularly useful, as will be explained, to arrange that more than one type of data store is used for a particular set of shares of a shared secret.

The policy engine 242 is connected to configuration services 228 via interface 235. The configuration services are connected to system administrators 232 via interface 268. The DDR module 234 is connected to a maintenance module 230 via interface 266. The maintenance module 230 is connected to system administrators 232 via interface 270.

The design and implementation of an access control sub-system can be quite flexible, and largely depend on specific requirements from an application domain. Access control issues are assumed to be addressed on a higher level (e.g. as described in U.S. Pat. No. 8,423,466), rather than being an integral part of the SCSS architecture. In other words, the underlying data I/O services behave as a relying party of the access control sub-system, and expect data producers and consumers to present valid security tokens as proof of authorised data operations.

The tier 3 middleware 202 will now be described.

Data I/O Services 210 provide fundamental create, read, update and delete (CRUD) operations 208 to data producers and consumers. Service-Oriented Architecture (SOA) is adopted to provide good interoperability that allows a wide range of clients developed on different software and hardware platforms to store and retrieve any data files conveniently.

Data producers and consumers may access the data I/O services 210 in slightly different ways. A data producer is regarded as the owner of the data files that it has previously stored in the SCSS architecture, and thus its CRUD operations on these files should be permitted right away. In this circumstance, the data I/O services 210 only need to authenticate a data producer's identity using a security token issued by the access-control sub-system. However, a data consumer must access the data I/O services 210 via a policy enforcement point (PEP), which guarantees that the data consumer has been authorised by the data owner to carry out a CRUD operation over a certain file.

When the data I/O services 210 receive a new data file from a client, they assign a unique ID to the file, register the ownership, and split the file into multiple secret shares using the secret sharing module 214 (as shown by line 212). Then, a variety of meta-data will be generated by the meta-data module 218, such as time-stamps, unique share IDs, and mappings from the share IDs to further tracking and management information (as shown by line 216). Notably, some of the meta-data is maintained by the meta-data module 218 internally, while some will be attached to the shares themselves, i.e., the sticky policies that will be handled by the tier-2 middleware 204 later on for share life-cycle management purposes. Next, the shares are passed on to the scheduler module 222 (as shown by line 220), which dynamically distributes the shares to Cloud data stores 206 through lower level Cloud I/O Services 226 (as shown by line 224). The sticky policy attached to the share/fragment may only relate to the unique ID for that share/fragment. The unique ID can then be used to locate the share/fragment and access the rest of the metadata maintained in the meta-data module 218. This ensures that the complete metadata cannot be accessed by only having access to the share/fragment in the cloud data stores 206.
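The separation between internally held meta-data and the identifier that travels with a share might be sketched as follows. The class and field names (`ShareRecord`, `MetadataModule`, `package_share`, `sticky_id`) are hypothetical, and the record fields are only examples of the tracking information mentioned above.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass
class ShareRecord:
    """Tracking information that stays inside the meta-data module."""
    file_id: str
    index: int         # position of the share within its file
    store_name: str    # which data store received the share
    created_at: float  # time-stamp


class MetadataModule:
    def __init__(self) -> None:
        self._records: dict[str, ShareRecord] = {}

    def register(self, file_id: str, index: int, store_name: str) -> str:
        """Create a unique share ID and keep its record internally."""
        share_id = uuid.uuid4().hex
        self._records[share_id] = ShareRecord(file_id, index, store_name, time.time())
        return share_id

    def share_ids_for(self, file_id: str) -> list[str]:
        """Resolve a file ID into its share IDs, ordered by share index."""
        matches = [(r.index, sid) for sid, r in self._records.items()
                   if r.file_id == file_id]
        return [sid for _, sid in sorted(matches)]

    def lookup(self, share_id: str) -> ShareRecord:
        return self._records[share_id]


def package_share(share_id: str, payload: bytes) -> dict:
    """What leaves the secure system: the payload plus only an opaque ID."""
    return {"sticky_id": share_id, "payload": payload}
```

Under this sketch, a party holding only a packaged share sees an opaque identifier and bytes; the algorithm, file ownership and sibling shares are discoverable only through the meta-data module.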

When the data I/O services 210 receive a reading request for a data file 208, they first resolve the file ID into corresponding share IDs using the meta-data module 218 and look up the tracking information for each of the shares; secondly, they ask the scheduler module 222 to recover these shares from the cloud data stores 206 (this operation terminates when a sufficient number of shares have been collected); and lastly, they reconstruct the original data file using the secret sharing module 214 and return the file to the client.
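The early termination of share recovery can be illustrated with a short sketch. The `fetch` callable stands in for whatever the scheduler module uses to retrieve a share, and the choice of `KeyError` and `ConnectionError` as the signals for a missing share or unreachable store is an assumption made for this example.

```python
from typing import Callable


def collect_shares(share_ids: list[str],
                   fetch: Callable[[str], bytes],
                   threshold: int) -> dict[str, bytes]:
    """Fetch shares one by one and stop as soon as `threshold` are in hand.

    A share that is missing or whose store is unreachable is skipped.
    """
    collected: dict[str, bytes] = {}
    for share_id in share_ids:
        try:
            collected[share_id] = fetch(share_id)
        except (KeyError, ConnectionError):
            continue  # try the next share
        if len(collected) >= threshold:
            return collected  # enough shares; remaining stores are not contacted
    raise RuntimeError(
        f"only {len(collected)} of the required {threshold} shares were recovered")
```

The collected shares would then be handed to the secret sharing module for reconstruction using the algorithm recorded in the meta-data.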

The selection by the secret sharing module 214 of the correct shared secret algorithm is described below. The secret sharing module may alternatively be referred to as a “crypto-fragmentation” module.

The processing of an updating request is similar to writing a new file 208, while it is possible either to delete the old file, or to keep it for versioning or auditing purposes. To process a delete request 208, the data I/O services 210 resolve the file ID into corresponding share IDs, and then ask the scheduler module 222 to delete all or enough of the shares to obfuscate recreation from the cloud data stores 206 and recalibrate indices related to dummy data.

Referring now to configuration services 228, the front end of the configuration services provides a graphical user interface for system administrators 232 to set up various runtime policies that control the behaviours of the tier 3 software modules 202, as shown by lines 235 to 240. The back end of the services is a policy engine 242, which interprets and enforces the policies in real-time. It is preferred that the configuration and maintenance services are segregated between the different middleware tiers, for increased segregation of duties and security.

The configuration policies dictate the following aspects of the SCSS architecture 200. For the secret sharing module 214, the policy defines the secret sharing schemes that are supported by the system, the hashing and encryption functions to be used by each individual scheme, the threshold T, and the total number of shares N. A policy may configure the secret sharing module 214 to apply a single scheme with static parameters to all the data files, or to apply a number of schemes with dynamic parameters flexibly so as to meet different application requirements on security, reliability and performance.

For the meta-data module 218, the policy defines the types and levels of meta-data that the system should generate and maintain. For example, a configuration policy may demand a comprehensive audit trail of all the changes made to a certain file. As a result, the meta-data module 218 would override the updating and deleting operations so as to keep all historical versions of the targeted file throughout its life-cycle. For the scheduler module 222, the policy defines the scheduling strategies that are supported by the system, for example, whether the system should apply round-robin scheduling to optimise load-balancing, or apply Byzantine fault-tolerance scheduling to optimise dependability, or apply social trust scheduling to optimise performance.
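Of the scheduling strategies named above, round-robin is the simplest to illustrate. The following is a minimal sketch only; the function name and the representation of stores by name are assumptions, and the other strategies (Byzantine fault-tolerance and social trust scheduling) are not attempted here.

```python
from itertools import cycle


def round_robin(share_ids: list[str], store_names: list[str]) -> dict[str, list[str]]:
    """Assign shares to stores in rotation so the load is spread evenly."""
    if not store_names:
        raise ValueError("at least one store is required")
    placement: dict[str, list[str]] = {name: [] for name in store_names}
    for share_id, name in zip(share_ids, cycle(store_names)):
        placement[name].append(share_id)
    return placement
```

When the number of stores is at least the number of shares, this rotation places each share on a different store, which is consistent with writing each share to an independent store.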

Referring now to maintenance services 230, these enable system administrators to configure the degradation detection & recovery (DDR) module 234, as shown by line 266. The DDR Module 234 is concerned with the integrity and retrievability of the secret shares that were distributed to the Cloud data stores 206. It obtains share IDs and corresponding tracking information from the Meta-Data Module 218 (as shown by line 241), and periodically challenges the Cloud data stores 206 using a proof-of-retrievability (PoR) protocol (as shown by line 225). In the case that a share is identified as corrupted or lost, the DDR module 234 will inform the meta-data module 218, which in turn generates a substitute share using the secret sharing module 214 and uploads the share using the scheduler module 222. A maintenance policy should specify technical details about the PoR protocol, as well as the interval for the DDR module 234 to carry out the checks.
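The periodic challenge can be illustrated with a deliberately simplified challenge-response check. This is not a full proof-of-retrievability protocol: it assumes the verifier precomputes a small number of single-use nonce/digest pairs before the share is uploaded, and that the store can compute a SHA-256 digest over a nonce and the stored share. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenges(share: bytes, count: int) -> list[tuple[bytes, bytes]]:
    """Precompute (nonce, expected digest) pairs before the share is uploaded."""
    challenges = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)
        challenges.append((nonce, hashlib.sha256(nonce + share).digest()))
    return challenges


def respond(stored_share: bytes, nonce: bytes) -> bytes:
    """What the data store computes when challenged with a nonce."""
    return hashlib.sha256(nonce + stored_share).digest()


def verify(challenge: tuple[bytes, bytes], response: bytes) -> bool:
    """Constant-time comparison of the response with the expected digest."""
    return hmac.compare_digest(challenge[1], response)
```

A failed verification would correspond to the case described above in which the share is treated as corrupted or lost and a substitute share is generated.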

Tier 2 middleware 204 will now be described.

The tier 2 middleware 204 implements CSP specific cloud proxies which provide lower level share-oriented CRUD operations and query functions through a consistent cloud I/O services interface, as shown by line 243.

Cloud proxies 244 provide both horizontal and vertical abstractions over a wide range of cloud data stores 206, as shown by lines 246 to 252. Horizontal abstraction refers to the compatibility with diversified CSPs under different management and/or control (e.g. Microsoft™, Amazon™, Google™, Rackspace™, etc). A cloud proxy 244 instance serves as a client of a CSP's proprietary API, and handles the input and output of secret shares efficiently. Vertical abstraction refers to the capability of a cloud proxy 244 instance to utilise a CSP's storage services on different levels appropriately. For example, the cloud proxy for Windows Azure™ may store secret shares using the blob service, yet store associated meta-data, such as sticky policies, using the table service. This is because the blob service is more cost-effective, and the table service provides better performance on queries. Similarly, the cloud proxy for AWS may store secret shares in S3, yet store meta-data in DynamoDB™, and so on. Such optimisations should be carried out by a cloud proxy 244 automatically, and be completely transparent to tier 3 middleware 202. The sticky policies are preferably stored with their associations to secret shares within the SCSS 200, where performance, cost and latency benefits result.

Another component of tier 2 middleware 204 is the sticky policy enforcement (SPE) module 254, which is an independent software process that constantly scans sticky policies of the secret shares and fulfils the security constraints, as shown by line 256. For example, the SPE module 254 deletes a secret share when it has expired according to the sticky policy.

Referring now to cloud data stores 206, the SCSS architecture 200 shall support as many types of cloud data stores as possible in order to provide high flexibility, scalability, reliability and cost-effectiveness. Public clouds 258 and private clouds 260 can be used in combination, and if necessary, the system can expand to include NoSQL data stores 262 (e.g. Cassandra™ and Druid™), or even traditional file servers 264. A dedicated cloud proxy needs to be implemented to bridge a particular data store and the unified cloud I/O service interface.

FIG. 3 shows a resilient secret sharing cloud-based architecture for a data vault 300. The architecture provides a system that allows data to be stored securely in a plurality of storage means 302. The system comprises a secret sharing module 304 (which may be regarded as an Anonymous and Distributed encryption Cloud Architecture, or ADeCA™, engine); a persistence engine 306 (which may be referred to as ATLAS™); the plurality of storage means 302; a logging unit 308; a transaction data unit 310; an authentication framework unit 312 with middleware apps 313; a web application unit 314 and a website unit 316. Each of these units is a module of software with or without its own independent hardware and they interface as shown in FIG. 3 across interfaces (which may be APIs).

The secret sharing module 304 is coupled to a secret sharing policy engine 318 and a policy rules database 320 (318 and 320 could alternatively be a single module). The persistence engine 306 is coupled to a persistence policy monitor 322 and a persistence data store 324. The authentication framework unit 312 and the web application unit 314 can be as described in U.S. Pat. No. 8,423,466, which is hereby incorporated in its entirety by reference. The website unit 316 comprises a firewall 326, a reverse proxy 328 and a load balancer 330.

In operation (after authentication), data 332 is sent from the middleware apps 313 to the secret sharing module 304 for splitting into secret shares and storing. The secret sharing module 304 receives control signals from secret sharing control 334 and also receives a secret sharing policy for the data from the secret sharing policy engine 318 according to the policy rules database 320. The secret sharing module 304 then splits the data into secret shares 336 which are forwarded to the persistence engine 306.

The persistence engine 306 receives the secret shares from the secret sharing module and distributes the secret shares, according to the persistence policy engine 322 and the persistence data store 324, via a firewall 338, to the plurality of storage means (e.g. cloud stores) 302.

FIG. 4 shows a more detailed implementation of the secret sharing module 304. It is a secure data storage system that provides a plurality of secret sharing algorithms for selection. The secret sharing module 304 is coupled to a plurality of storage means 302 and to a secret sharing policy engine 318 (not separately shown in FIG. 4) and policy rules database 320. The secret sharing policy engine 318 and the policy rules database 320 can provide an interface for the Data I/O Services 210 and the Configuration Services 228 of FIG. 2.

The secret sharing algorithms allow data 332 to be split into a plurality, N, of secret shares 402 a-402 n according to a selected secret sharing algorithm, such that a threshold number of shares, T, is sufficient to recover the data, where T is less than N. Some examples of secret sharing algorithms include the Perfect Secret Sharing Scheme (PSS); Computational Secret Sharing (CSS); Information Dispersal Algorithm (IDA) and Reed-Solomon encoding with combined encryption. This list is not exhaustive, and any secret sharing scheme may be used alone or in combination. The selection of N and T is determined by the striping policy. The secret sharing policy engine 318 and the policy rules database 320 allow the administrator to set, and the user or administrator to select, preferences which in turn select which secret sharing algorithm is to be used for the particular user or particular data or other circumstances.
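As an illustration of a (T, N) threshold scheme of the PSS kind, the following is a minimal sketch of Shamir's scheme over a prime field. It is not the implementation of Appendix 1; the field modulus PRIME, the function names split_secret and recover_secret, and the restriction to an integer secret smaller than PRIME are assumptions made for brevity.

```python
import secrets

# Assumption: the Mersenne prime 2**127 - 1 is used as the field modulus,
# so the secret must be an integer in the range [0, PRIME).
PRIME = 2**127 - 1


def split_secret(secret: int, n: int, t: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares such that any t of them recover it."""
    if not 1 <= t <= n or not 0 <= secret < PRIME:
        raise ValueError("require 1 <= t <= n and 0 <= secret < PRIME")
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 using the supplied shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, with N=5 and T=2, any two of the five shares returned by split_secret are enough for recover_secret, while a single share reveals nothing about the secret.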

The secret sharing module 304 is able to identify a striping policy according to input preferences. The input preferences may be provided to the policy rules database 320 by users or administrators through configuration services 232 (FIG. 2).

The secret sharing policy engine 318 and the policy rules database 320 allow a user or administrator to select a relatively high T/N ratio or a relatively low T/N ratio. A relatively high T/N ratio creates a relatively high security but relatively low resilience secret sharing algorithm. A relatively low T/N ratio creates a relatively low security but relatively high resilience secret sharing algorithm. The selection of T/N is translated into the selection of striping policy.
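The trade-off can be made concrete with a small calculation: an attacker must obtain T shares to recover the data, while up to N - T shares can be lost before the data becomes unrecoverable. The helper below is a hypothetical illustration (the name striping_tradeoff and the returned keys are assumptions), not part of the described system.

```python
def striping_tradeoff(n: int, t: int) -> dict:
    """Summarise the security/resilience trade-off of a (T, N) striping policy."""
    if not 1 <= t < n:
        raise ValueError("require 1 <= t < n")
    return {
        "t_over_n": t / n,
        "shares_needed_to_recover": t,   # higher means harder to compromise
        "shares_that_may_be_lost": n - t,  # higher means more resilient
    }


# A high ratio (T=4 of N=5) versus a low ratio (T=2 of N=5).
for n, t in [(5, 4), (5, 2)]:
    print(n, t, striping_tradeoff(n, t))
```

With N=5, the high-ratio policy tolerates the loss of only one share but requires four to be compromised, whereas the low-ratio policy tolerates the loss of three shares but requires only two.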

The secret sharing policy engine 318 and the policy rules database 320 optionally allow the user or administrator to select or configure an encryption method such as the Advanced Encryption Standard (AES) or Blowfish™. The encryption method can be used on the data or the plurality of secret shares. The encryption method is stored in the secret sharing module 304.

The secret sharing module 304 uses preferences provided by the secret sharing policy engine 318 and the policy rules database 320 to split the data into a plurality of secret shares 402 a-402 n. Where no specific selection or preference of policy or rules is chosen for data, the secret sharing module 304 will automatically apply a secret sharing and persistence method from a default set of one or more of a plurality of secret sharing algorithms and encryption, based on parameters such as data type, size, policy feedback from the persistence engine and other operating parameters related to the current efficiency of the architecture 314. This particular method provides a further application of a “zero knowledge” user approach to an SCSS (i.e. a system in which those who work on one part of the system have no knowledge of what is happening in another part).

The secret sharing module 304 generates metadata associated with the data 332 and the plurality of secret shares 402 a-402 n. In particular, the metadata comprises information according to the selected secret sharing algorithm and/or parameters that was/were used to create the plurality of secret shares 402 a-402 n. By “algorithm” is meant the method of splitting the data (e.g. file) into a plurality of secret shares (e.g. PSS, CSS, IDA, Reed-Solomon coding with encryption, etc.) and by “parameters” are meant at least the values N and T. The term “method” will be used generically to encompass different algorithms and different implementations of an algorithm with different parameters.

The metadata generated in the secret sharing module 304 attaches to the shares. Metadata stored in the middleware apps 313 and policy engine 318 may also include policy rules for access control purposes such as which shareholders are regarded as the owners of which shares and in what circumstances they are allowed to retrieve the shares. Additionally, the ADeCA™ engine generates an identifier and attaches the identifier to the plurality of secret shares.

The encryption method, the metadata and the identifier are collectively known as sticky policies.
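One possible in-memory shape for such a sticky policy is sketched below. The class name StickyPolicy, its field names, the use of a random UUID as the identifier and the JSON serialisation are all assumptions made for illustration; the specification does not prescribe a format.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class StickyPolicy:
    """Hypothetical record combining method metadata, encryption and identifier."""
    method: str       # e.g. "PSS", "CSS", "IDA"
    n: int            # total number of shares
    t: int            # threshold needed for recovery
    encryption: str   # e.g. "AES"
    identifier: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "StickyPolicy":
        return cls(**json.loads(text))
```

On recovery, a record of this kind would tell the secret sharing module which method and which values of N and T to use once at least T shares with the matching identifier have been retrieved.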

The secret sharing module 304 writes the plurality of secret shares 402 a-402 n to the plurality of storage means 302. The plurality of storage means 302 are outside of the secure storage system of the secret sharing module 304. The plurality of storage means 302 may include a public cloud 258, a private cloud 260, a non-SQL data store 262 and a file server 264. The plurality of storage means may be referred to as the Multi-Cloud. The cloud storage means 302 may alternatively be referred to as the cloud service providers (CSPs).

It is preferred that each of the plurality of storage means is an independent storage means, each having a different address (e.g. URL or URI). They may also have a separate set of virtual machines that are managed to different interfaces. They may be independently addressable, may not have the same published endpoints, may be provided by different cloud service providers, may have different underlying technology and/or may be in a different geographic location.

In a preferred embodiment, the secret sharing module 304 writes a single secret share of the plurality of secret shares 402 a-402 n to a single storage means of the plurality of storage means 302. In an alternative embodiment, the ADeCA™ engine writes more than one but fewer than T secret shares of the plurality of secret shares 402 a-402 n to a single storage means of the plurality of storage means 302.

The secret sharing module 304 is synonymously referred to as the ADeCA™ data vault and may encompass the application platform 102, the main multi-cloud proxy server with router 104 and the meta-data module 106.

To retrieve the data after it has been stored, the secret sharing module 304 uses the identifier to retrieve at least T shares of the plurality of secret shares 402 a-402 n from the plurality of storage means 302. Then the secret sharing module 304 uses the stored metadata to recreate the data from the retrieved secret shares.

Appendix 1 shows a detailed implementation of the secret sharing module using Java™.

Operation of the persistence engine 306 and the persistence policy engine 322 of FIG. 5 is now described.

Data 332 is fragmented into a plurality, N, of secret shares 402 a-402 n of equal size according to a secret sharing algorithm, wherein a threshold number, T, of such shares is sufficient to recover the data. T is less than N.

The persistence engine splits each of the plurality of secret shares 402 a-402 n into p data particles 404 aa-404 np. The data particles are preferably of equal size to ensure that each data particle is anonymous relative to each other. The size of the data particles 404 aa-404 np is determined by the administrator preferences. Alternatively, the size of the data particles is determined by computational limitations, such as available storage and bandwidth. In a further alternative embodiment, the size of the data particles is determined by a combination of input preferences and computational limitations.
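Splitting a share into p equal-size particles can be sketched as follows. The zero-byte padding of the final particle and the need to remember the original share length are assumptions of this sketch (the specification does not say how a share whose length is not a multiple of p is handled), and the function names are hypothetical.

```python
def split_into_particles(share: bytes, p: int) -> tuple[list[bytes], int]:
    """Split `share` into p particles of equal size, zero-padding the tail.

    Returns the particles and the original length, which is needed to strip
    the padding when the share is reassembled.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    size = -(-len(share) // p)  # ceiling division
    padded = share.ljust(size * p, b"\0")
    particles = [padded[i * size:(i + 1) * size] for i in range(p)]
    return particles, len(share)


def join_particles(particles: list[bytes], original_length: int) -> bytes:
    """Reassemble a share from its particles, in order, and drop the padding."""
    return b"".join(particles)[:original_length]
```

Because every particle has the same length, size alone gives no hint as to which share, or which position within a share, a particle belongs to.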

The user or administrator preferences and computational limitations are identified by the persistence policy engine 322.

The persistence engine 306 adds an identifier to each of the data particles 404 aa-404 np. The identifier enables the persistence engine 306 to keep track of where each of the data particles 404 aa-404 np is located. The identifier is stored in a secure storage system only accessible by the persistence engine 306. In a further embodiment, this secure storage system could be stored recursively by another SCSS.

The persistence engine 306 writes the data particles to a plurality of storage means 302. Preferably, all data particles 404 xa-404 xp of a share x are written to an independent storage means corresponding to that share, and data particles of different shares are written to different independent storage means. In this way, if an independent storage means is lost or a data particle within that independent storage means is lost, the result would mean a loss of one share at the most. In doing this, the persistence engine 306 ensures that data processed through the architecture 300 is not vulnerable to loss or compromise of any single CSP or independent storage means.
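The placement rule (all particles of share x to one store, different shares to different stores) can be sketched as a simple mapping. The function names and the representation of a store by its name are assumptions for illustration only.

```python
def plan_placement(num_shares: int, particles_per_share: int,
                   stores: list[str]) -> dict[str, list[tuple[int, int]]]:
    """Assign every particle of share x to one store, one distinct store per share.

    Returns a mapping from store name to the (share index, particle index)
    pairs it will hold.
    """
    distinct = list(dict.fromkeys(stores))  # drop duplicate names, keep order
    if len(distinct) < num_shares:
        raise ValueError("need at least one distinct store per share")
    return {
        distinct[x]: [(x, k) for k in range(particles_per_share)]
        for x in range(num_shares)
    }


def shares_affected_by_loss(plan: dict[str, list[tuple[int, int]]],
                            lost_store: str) -> set[int]:
    """Which shares become incomplete if `lost_store` disappears."""
    return {share for share, _ in plan.get(lost_store, [])}
```

Under this plan the loss of any one store affects at most one share, so the data remains recoverable as long as at least T stores survive.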

Each storage means is “independent” in that it is specific to a group of particles comprising a share.

Since the data particles 404 aa-404 np are anonymous relative to each other, a hacker entering the independent storage means 302 would not be able to identify one particle from another. Still less would the hacker be able to determine which data particles within the store have sensitive information, or which form a set 404 xa-404 xp that may together have sensitive information.

The independent storage means may pre-store particles of dummy data whereby the data particles 404 xa-404 xp in a store co-exist with particles of dummy data. By so doing, the data particles are further obfuscated in the independent storage means 302.

The persistence engine 306 may perform a clean-up process for each of the plurality of storage means 302. The data particles are given a set timer and once the timer has finished the data particles are expired. The expired data particles then become particles of dummy data and co-exist with the data particles 404 aa-404 np. This obfuscates the data particles 404 aa-404 np without the need to generate further particles of dummy data.
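One way to read this clean-up process is that expiry removes a particle from the persistence engine's tracking information while leaving its bytes in the store, where they are indistinguishable from live particles. The sketch below follows that reading; the tracking dictionary, the clean_up function and the use of timestamps are assumptions, not details given in the specification.

```python
import time
from typing import Optional


def clean_up(tracking: dict[str, float], now: Optional[float] = None) -> list[str]:
    """Forget every particle whose timer has run out.

    `tracking` maps particle identifiers to expiry timestamps. Expired entries
    are removed from the tracking information only; nothing is deleted from the
    storage means, so the untracked particles remain there as dummy data.
    """
    now = time.time() if now is None else now
    expired = [pid for pid, expires_at in tracking.items() if expires_at <= now]
    for pid in expired:
        del tracking[pid]
    return expired
```

After clean_up runs, the store still holds the same number of particles, but only those still listed in the tracking information can be mapped back to a share.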

The persistence policy engine 322 may identify a persistence policy in accordance with input preferences. A set of the plurality of storage means is then selected in accordance with the persistence policy.

In an embodiment, the persistence policy is identified in accordance with the sensitivity of data. This ensures that highly sensitive data is stored in the most secure storage means.

The persistence policies are defined by user or administrator preferences. For example, the user or administrator preferences may include a set of storage means that are not to be used. Alternatively, the user or administrator preference may include a set of storage means that are preferred. The user or administrator preferences may be based on attributes relating to the plurality of storage means 302. The attributes may include identifiers of storage providers and geographical locations.
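Selecting a set of storage means from such preferences can be sketched as a filter followed by a preference ordering. The StorageMeans record, its attributes and the select_storage function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageMeans:
    """Hypothetical description of one storage means and its attributes."""
    name: str
    provider: str
    region: str


def select_storage(candidates: list[StorageMeans], count: int,
                   excluded_providers: frozenset = frozenset(),
                   preferred_regions: frozenset = frozenset()) -> list[StorageMeans]:
    """Drop excluded providers, then take `count` stores, preferred regions first."""
    allowed = [s for s in candidates if s.provider not in excluded_providers]
    if len(allowed) < count:
        raise ValueError("policy leaves too few storage means")
    # Stable sort: stores in a preferred region (key False) come before the rest.
    allowed.sort(key=lambda s: s.region not in preferred_regions)
    return allowed[:count]
```

An exclusion is treated here as mandatory and a regional preference as best-effort; a real policy engine may weigh these differently.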

The persistence policy may include a latency preference. The persistence policy may also include duplication of the plurality of shares across the plurality of storage means.

The performance of the plurality of storage means 302 is monitored according to the persistence policies. The set of storage means 302 used to store data particles 404 aa-404 np is then adjusted based on the performance of the plurality of storage means 302.

The selection of persistence policy includes a measure of the trustworthiness of the plurality of storage means 302.

FIG. 5 shows a more detailed implementation of the persistence engine 306. It comprises the persistence engine 306, the persistence policy engine 322 (this is also referred to as persistence control), the plurality of storage means 302, and persistence storage means information databases 502. The latter comprises a persistence information database related to shares, cloudlet locations and policy rules 506 and a share set data database 508. A share tracking database 504 is provided that has information for tracking shares across a plurality of different storage architectures.

In operation, the persistence engine 306 receives a plurality of secret shares 402 a-402 n and splits them into data particles 404 aa-404 np that are anonymous relative to each other. The persistence engine 306 also receives the user or administrator preferences and computational limitations from the persistence policy engine 322. The persistence engine 306 then adds an identifier to each of the data particles 404 aa-404 np. The identifier enables the persistence engine 306 to keep track of where each of the data particles 404 aa-404 np is located. The identifier is stored in the persistence storage means information database 502. The persistence engine 306 then stores the data particles 404 aa-404 np in the plurality of storage means 302.

The plurality of storage means 302 is determined by the user or administrator's preferences. In an embodiment, the identifier is stored with the corresponding data particle.

FIG. 6 is a process flow diagram showing operation of the persistence engine 306 of FIG. 5.

At step 602, the persistence engine 306 creates a persistence ID. The persistence ID includes information provided by the persistence policy engine 322, such as the user or administrator preferences. At step 604, the persistence engine 306 receives a “put” message, which tells the persistence engine to store data into the plurality of storage means 302. At step 606, the persistence engine 306 retrieves information from database 506 on the plurality of storage means 302 to be used. This allows suitable cloudlets to be selected and thus ensures that the data to be stored in the storage means 302 is stored correctly, in accordance with preferred policy.

At step 608, the persistence engine 306 sends a cloudlet controller 510 a-510 n information on how to store the secret shares 336. At step 610, the secret shares 336 are sent to the cloudlets 302. The cloudlet controllers 510 a-510 n write the shares to the plurality of storage means 302 according to information provided at step 608.

If necessary, at step 612, the cloudlet controllers 510 a-510 n retry storing any secret shares 336 that failed to store at a first attempt. This involves returning to step 606, whereupon the persistence engine 306 re-tries to write the secret shares 402 a-402 n to the same storage means 302 or the persistence engine 306 may attempt to write the secret shares 402 a-402 n to an alternative storage means 302. A retry attempt can be at the particle level if storage of only certain particles failed. This process is repeated until all the data particles 404 aa-404 np are written to the data storage means 302.

At step 614, the cloudlet controllers 510 a-510 n send feedback messages 514 to the persistence engine 306. Each storage means 302 sends a success message if the shares 336 were written to the storage means 302 and sends a fail message if the shares 336 were not written to the storage means 302.

At step 616, the shares 336 are deleted in the persistence engine 306.

Steps 618 and 620 are optional steps that relate to logging module 308 (of FIG. 3). In step 618, data relating to the complete share is written to blob from tracking. In step 620 tracking data for the share is removed.
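The write, retry and feedback loop of steps 608 to 614 can be sketched as follows. This is a simplified, sequential illustration: the store_with_retry function, the write callback, the bounded number of attempts and the rotation to the next store on failure are assumptions, and a real system would also have to preserve the one-share-per-store rule when choosing an alternative store.

```python
from typing import Callable


def store_with_retry(shares: list[bytes], stores: list[str],
                     write: Callable[[str, int, bytes], None],
                     max_attempts: int = 3) -> dict[int, str]:
    """Write each share, retrying on failure, and return per-share feedback.

    `write(store, share_index, share)` is assumed to raise OSError on failure.
    The first attempt for share i targets stores[i]; each retry moves on to the
    next store in the list (wrapping around).
    """
    feedback = {}
    for idx, share in enumerate(shares):
        feedback[idx] = "fail"
        for attempt in range(max_attempts):
            store = stores[(idx + attempt) % len(stores)]
            try:
                write(store, idx, share)
            except OSError:
                continue  # try the next candidate store
            feedback[idx] = "success"
            break
    return feedback
```

The returned dictionary plays the role of the success and fail messages of step 614; a caller could use it to decide which shares, if any, still need attention.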

FIG. 7 shows a detailed implementation of the persistence engine 700 that allows the cloudlets and storage means 302 to feed back performance levels to the persistence engine.

The system shows persistence orchestrator 702 coupled to persistence policy engine 322. The persistence orchestrator is further connected to the cloudlet controller 510, a retry requests module 704, the share set data database 508, the share tracking data database 504 and optionally a feedback module 516. The system further comprises a persistence listener 706, which is coupled to the retry requests module 704, the cloudlet data database 506 and the share set database 508, the share tracking data database 504 and the cloudlet feedback module 514.

The cloudlet controller 510 is connected to cloudlet workers 512 a-n, the secret shares module 336 and a share data database 710. The cloudlet workers are connected to the storage means 302. A cloudlet policy monitor 708 connects the storage means 302 with the cloudlet feedback module 514. Cloudlet workers and cloudlets are asynchronous. One cloudlet worker is spawned per share to store that share and, during retrieval, a cloudlet worker retrieves one share per cloudlet. The cloudlet controller 510 operates on a set of shares. It waits for T shares for a particular identifier to arrive back from the cloudlet workers.
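Waiting for the first T shares to arrive from asynchronous workers can be sketched with a thread pool. The retrieve_threshold function and the representation of each cloudlet worker as a zero-argument callable are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable


def retrieve_threshold(fetchers: list[Callable[[], Any]], t: int) -> list[Any]:
    """Run one worker per cloudlet and return the first t shares that arrive.

    Each fetcher is assumed to return one share or raise on failure. Failed
    workers are ignored; RuntimeError is raised if fewer than t succeed.
    """
    collected = []
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = [pool.submit(fetch) for fetch in fetchers]
        for future in as_completed(futures):
            try:
                collected.append(future.result())
            except Exception:
                continue  # this cloudlet failed; wait for the others
            if len(collected) >= t:
                break
        # Leaving the `with` block still waits for workers that are running.
    if len(collected) < t:
        raise RuntimeError("fewer than t shares could be retrieved")
    return collected[:t]
```

Because only T of the N shares are needed, slow or unavailable cloudlets do not prevent recovery as long as T workers return successfully.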

The shared secret policy engine 242 (FIG. 2)/318 (FIG. 3) and the persistence policy engine 322 have interdependent common attributes. This may be extended to include the shared secret policy engine 318 and shared secret policy rules 320, and persistence policy engine 322, indicating the interdependency of administrator and end user policy interaction. That is, administrator policies and end user policies can be implemented in either the shared secret policy engine 318/shared secret policy rules 320, or the persistence policy engine 322 or both and these may be interdependent.

For example, a user/administrator may seek to achieve a certain level in a 3-dimensional space of (a) resilience, (b) security and (c) performance (each on a scale of minimum to maximum or 1 to 10 or 0 to 100). Such a level is converted into a policy for the shared secret policy engine 318/shared secret policy rules 320 and a policy for the persistence policy engine 322, but if the latter (for example) is unable to achieve the desired level of performance, this may lead to adjustment of the persistence policy or may lead to adjustment of the shared secret policy engine (e.g. if performance takes priority in the overall policy) and lead to compromise in one of the other dimensions (resilience and security). Alternatively, if one of the other dimensions (e.g. security) takes priority, this may lead to compromise in performance in the shared secret engine and/or the persistence engine.

Note that a policy may include obligations (mandatory requirements) and preferences. A preference at a maximum level (10 or 100) may be construed as “mandatory” while a lower preference is non-mandatory.

For example, a healthcare application may require high security and medium performance, but may involve file sizes of 100s of Mbytes. It may typically be possible to handle such a large file in the shared secret module to the satisfaction of the shared secret policy, but when passed to the persistence module, the large file size may create performance issues for the persistence engine at that level of security and, in such a case, the persistence policy engine may instruct the shared secret policy engine to adjust its policy and (for example) increase the number N. Alternatively, the persistence policy engine may seek to store the data on a certain type of storage in order to achieve a certain level of security but may have to modify that policy because the preferred storage will not meet the performance goals (or compromise on performance in order to meet the security goals).

Thus, policies may include prioritization of security versus resilience versus performance in a three-dimensional model.

The shared secret (ADeCA™) engine and the persistence engine (ATLAS™) can be used together to boost the security of data. In an embodiment, the ADeCA engine uses a secret sharing scheme that allows the data to be split into a maximum of 50 shares. If the data is very sensitive then the policy will then fragment each of the 50 shares into up to 250 anonymous and equal data particles before storing the data particles into up to 20 independent storage means with relatively high security. The benefit of this embodiment is that the data is secure. In an alternative scenario, the data is deemed to be not very sensitive but must be able to be retrieved quickly. In this case, the shared secret engine may split the data into 20 shares and the persistence engine may fragment the 20 shares into 100 fragments and then store the fragments in 20 independent storage means with relatively low security.

Certain policies can be implemented only by the persistence engine (e.g. geographical, ownership or other restrictions on which cloudlets may be selected for a particular application/usage case). Meeting such policies by the persistence engine may require adjustment of other policies by the persistence policy engine and/or the shared secret policy engine to achieve other aspects of resilience/performance.

Secret sharing schemes have been proposed for data splitting and reconstruction, thereby providing data security in a keyless manner. This section outlines three of the main contenders for secret sharing schemes in cloud-based systems. They are Adi Shamir's Perfect Secret Sharing Scheme (PSS), Hugo Krawczyk's Secret Sharing Made Short or Computational Secret Sharing scheme (CSS) and Rabin's Information Dispersal Algorithm (IDA). The performance overhead of the three secret sharing schemes, at increasing thresholds and increasing data sizes, shows varied behaviours. The varied behaviours reflect the secret sharing schemes' strengths and weaknesses in different application scenarios.

It is useful to know the effect that variance in data size has on the performance of each secret sharing scheme (SSS) algorithm, in terms of share creation and share recreation, before applying any of them in cloud-based designs.

Data sizes from 1024 KB to 16,384 KB were evaluated. The data generated are arbitrary because the evaluations are not tailored to any one specific area in which SSS algorithms may be applied. The test machine is a D-Series 3 specification Microsoft Azure™ virtual machine which consists of 4 vCores, 14 GB of RAM and a 200 GB SSD.

Two primary sets of results were obtained, using the parameters N=5; T=2 and N=10; T=4. The variable N relates to the number of shares to create while the variable T relates to the number of shares required for recreation of the original arbitrary data (using each SSS algorithm). It is found that IDA is the fastest algorithm regardless of data size. CSS comes second in terms of time taken for share creation and recreation, while PSS comes last. One significant observation in the results is that PSS demonstrates greater issues with scalability as the data size increases in comparison with the other two algorithms. Additionally, as the parameters are increased from N=5; T=2 to N=10; T=4, only share creation shows a significant increase in processing time.

Although IDA has demonstrated the fastest time in test results, in this context it would be naive to simply use this algorithm on the basis of these results alone. Depending on the context and application, there may be a need to strike a balance between ensuring strong security and an acceptable level of performance. Thus, ultimately, the decision on which SSS algorithm to use will be most dependent on the use-case scenario at hand.

1. A method of securely storing data comprising: providing, within a secure data storage system, a plurality of secret sharing methods for selection; identifying a striping policy for storage of the data, in accordance with input preferences; splitting the data into a plurality, N, of secret shares according to a selected one of the plurality of secret sharing methods, the selection being determined by the striping policy, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N, generating metadata associated with the data, the metadata identifying the selected secret sharing method and storing the metadata within the secure data storage system; writing the secret shares to storage that includes storage outside the secure data storage system, such that, when at least T shares are retrieved, the metadata can be recalled to identify the selected secret sharing method for recovery of the data.
2. The method of claim 1, wherein the secret sharing methods include methods with relatively high security but relatively low resilience and methods with relatively high resilience but relatively low security and wherein the selection of the striping policy is based on preferences that are translated into security and resilience preferences.
3. The method of claim 1, wherein the secret sharing methods include methods with relatively high T/N and relatively low T/N and wherein an interface is provided to enable a user or administrator to input preferences that are translated into selection of a striping policy.
4. The method of claim 1, wherein the secret sharing methods include different secret sharing algorithms selected from the group that includes: perfect secret sharing scheme (PSS); computational secret sharing (CSS); information dispersal algorithm; and Reed-Solomon encoding combined with encryption.
5. The method of claim 1, wherein the policy translates a user preference for security, resilience and/or performance into a selection of method.
6. The method of claim 1, wherein each share is written to an independent store, at least some of which are outside the secure storage system.
7. The method of claim 1, wherein the independent stores comprise a plurality of stores selected from the group that includes: a public cloud; a private cloud; a relational data store; a non-SQL data store; and a file server.
8. A system for securely storing data comprising: a secret sharing module adapted to provide a plurality of secret sharing methods for selection, each method arranged to split the data into a plurality, N, of secret shares wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; a policy module adapted to determine a policy for storage of the data, in accordance with input preferences, wherein the method selected by the secret sharing module for splitting the data is determined by the policy module; a metadata module for generating and storing metadata associated with the data, the metadata identifying the selected secret sharing method; and a memory interface for writing the secret shares to storage such that, when at least T shares are retrieved from storage, the metadata can be recalled to identify the selected secret sharing method for recovery of the data.
9. A computer program product or products comprising program code which, when executed by a computer or a plurality of interconnected computers, causes the computer(s) to: receive data for secure storage; identify a striping policy for storage of the data, in accordance with input preferences; split the data into a plurality, N, of secret shares according to a selected one of the plurality of secret sharing methods, the selection being determined by the striping policy, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N; generate and store metadata associated with the data, the metadata identifying the selected secret sharing method; write the secret shares to storage; retrieve at least T shares; recall the metadata; identify the selected secret sharing method from the metadata; and recover the data.
10. A method of securely storing data comprising: fragmenting the data into a plurality, N, of secret shares of equal size according to a secret sharing algorithm, wherein a threshold number, T, of such shares is sufficient to recover the data, where T is less than N, splitting each share into data particles of equal size; writing the particles to storage such that the particles of each share are written to independent storage means corresponding to that share, each particle being identified only by an identifier unique within its respective storage means.
11. The method of claim 10 further comprising pre-storing particles of dummy data within each storage means, whereby the particles of data and particles of dummy data co-exist in the storage means.
12. The method of claim 10, further comprising performing a clean-up process for each storage means, whereby particles that exist in the storage means are identified as having expired, whereby particles of data and particles of expired data co-exist in the storage means.
13. The method of claim 10, further comprising identifying a persistence policy for storage of the data in accordance with input preferences, whereby a set of storage means is selected for storage of the data in accordance with the persistence policy.
14. The method of claim 10, further comprising identifying a persistence policy for storage of the data in accordance with a sensitivity attribute associated with the data.
15. The method of claim 13, wherein policies are defined for user selection that include restrictions on attributes of the storage means that are to be selected to make up the set of storage means.
16. The method of claim 13, wherein policies are defined for user selection that include different attributes for each of the storage means that are to be selected to make up the set of storage means.
17. The method of claim 16 wherein the attributes include identifiers of storage providers and geographical locations of the storage means.
18. The method of claim 13, wherein policies are defined for user selection that include user latency preference.
19. The method of claim 13, wherein policies are defined for user selection that include duplication of one or more shares across plural independent storage means.
20. The method of claim 13, wherein policies are defined for user selection that include trustworthiness of the storage means.
21. The method of claim 10, further comprising monitoring the performance of each storage means for improvement of selection of storage means according to persistence policy.